Introduction

Previously, in a Kaggle Kernel, I explored the Class Activation Map (CAM) of a VGG16. In this notebook, I'll explore the CAM of Inception V3 on the predictions for a few images (including the finding from the University of Tübingen on the impact of texture on the prediction (link)).

In [1]:
import numpy as np

import tensorflow as tf
from tensorflow.keras.applications.inception_v3 import decode_predictions

import matplotlib.pyplot as plt

from scipy.stats import entropy
from numpy.linalg import norm

from PIL import Image
import cv2

Dataset preparation

A very simple step here: let's load some images, resize them to 299x299, and scale them to the range [0, 1].

In [2]:
test_image1 = Image.open("boat1.jpg")
test_image2 = Image.open("boat2.jpg")
test_image3 = Image.open("boat3.jpg")
test_image4 = Image.open("cat.jpg")
test_image5 = Image.open("voiture.jpg")
test_image6 = Image.open("voiture2.jpg")
In [3]:
test_image1 = test_image1.resize((299, 299))
test_image2 = test_image2.resize((299, 299))
test_image3 = test_image3.resize((299, 299))
test_image4 = test_image4.resize((299, 299))
test_image5 = test_image5.resize((299, 299))
test_image6 = test_image6.resize((299, 299))
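The repeated resize calls above could equally be written as a loop (a minimal sketch; synthetic in-memory images stand in for the JPEG files here, and the notebook keeps separate variables so the later cells can refer to each image by name):

```python
from PIL import Image

# Synthetic stand-ins for the six loaded JPEGs.
images = [Image.new("RGB", (640, 480)) for _ in range(6)]

# Resize every image to the 299x299 input size expected by Inception V3.
resized = [img.resize((299, 299)) for img in images]

print(all(img.size == (299, 299) for img in resized))  # → True
```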
In [4]:
fig, axes = plt.subplots(2, 3, figsize=(20,12))
axes[0, 0].imshow(test_image1)
axes[0, 1].imshow(test_image2)
axes[0, 2].imshow(test_image3)
axes[1, 0].imshow(test_image4)
axes[1, 1].imshow(test_image5)
axes[1, 2].imshow(test_image6)
plt.show()
In [5]:
X = np.zeros(shape=(6, 299, 299, 3), dtype=np.float32)
X[0] = np.array(test_image1)
X[1] = np.array(test_image2)
X[2] = np.array(test_image3)
X[3] = np.array(test_image4)
X[4] = np.array(test_image5)
X[5] = np.array(test_image6)
X /= 255.0
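As a quick sanity check on this preprocessing, the scaled batch should lie entirely in [0, 1]. A minimal NumPy sketch, with a synthetic uint8 image standing in for the loaded files:

```python
import numpy as np

# Synthetic stand-in for one loaded 299x299 RGB image (uint8 pixels in [0, 255]).
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(299, 299, 3), dtype=np.uint8)

X = np.zeros(shape=(1, 299, 299, 3), dtype=np.float32)
X[0] = fake_image
X /= 255.0  # scale pixel values from [0, 255] to [0, 1]

print(X.min() >= 0.0 and X.max() <= 1.0)  # → True
```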

Full prediction

We can now load the pre-trained model and run the classification.

In [6]:
IMG_SHAPE = (299, 299, 3)

base_model = tf.keras.applications.InceptionV3(input_shape=IMG_SHAPE,
                                               include_top=True, 
                                               weights='imagenet')
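The classification itself happens further on; as a quick illustration of what `decode_predictions` returns, here is a sketch using a synthetic score vector in place of real model output (in the standard ImageNet mapping used by Keras, class index 1 is "goldfish"; with the notebook's data you would pass `base_model.predict(X)` instead):

```python
import numpy as np
from tensorflow.keras.applications.inception_v3 import decode_predictions

# Synthetic (batch, 1000) score array standing in for base_model.predict(X).
scores = np.zeros((1, 1000), dtype=np.float32)
scores[0, 1] = 1.0  # put all probability mass on ImageNet class index 1

# decode_predictions returns, per image, a list of (class_id, class_name, score).
top = decode_predictions(scores, top=3)[0]
print(top[0][1])  # → goldfish
```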
In [7]:
base_model.trainable = False
base_model.summary()
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 299, 299, 3)  0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 149, 149, 32) 864         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_v1 (BatchNo (None, 149, 149, 32) 96          conv2d[0][0]                     
__________________________________________________________________________________________________
activation (Activation)         (None, 149, 149, 32) 0           batch_normalization_v1[0][0]     
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 147, 147, 32) 9216        activation[0][0]                 
__________________________________________________________________________________________________
batch_normalization_v1_1 (Batch (None, 147, 147, 32) 96          conv2d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 147, 147, 32) 0           batch_normalization_v1_1[0][0]   
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 147, 147, 64) 18432       activation_1[0][0]               
__________________________________________________________________________________________________
batch_normalization_v1_2 (Batch (None, 147, 147, 64) 192         conv2d_2[0][0]                   
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 147, 147, 64) 0           batch_normalization_v1_2[0][0]   
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 73, 73, 64)   0           activation_2[0][0]               
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 73, 73, 80)   5120        max_pooling2d[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_3 (Batch (None, 73, 73, 80)   240         conv2d_3[0][0]                   
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 73, 73, 80)   0           batch_normalization_v1_3[0][0]   
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 71, 71, 192)  138240      activation_3[0][0]               
__________________________________________________________________________________________________
batch_normalization_v1_4 (Batch (None, 71, 71, 192)  576         conv2d_4[0][0]                   
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 71, 71, 192)  0           batch_normalization_v1_4[0][0]   
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 35, 35, 192)  0           activation_4[0][0]               
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 35, 35, 64)   12288       max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
batch_normalization_v1_8 (Batch (None, 35, 35, 64)   192         conv2d_8[0][0]                   
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 35, 35, 64)   0           batch_normalization_v1_8[0][0]   
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 35, 35, 48)   9216        max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 35, 35, 96)   55296       activation_8[0][0]               
__________________________________________________________________________________________________
batch_normalization_v1_6 (Batch (None, 35, 35, 48)   144         conv2d_6[0][0]                   
__________________________________________________________________________________________________
batch_normalization_v1_9 (Batch (None, 35, 35, 96)   288         conv2d_9[0][0]                   
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 35, 35, 48)   0           batch_normalization_v1_6[0][0]   
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 35, 35, 96)   0           batch_normalization_v1_9[0][0]   
__________________________________________________________________________________________________
average_pooling2d (AveragePooli (None, 35, 35, 192)  0           max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 35, 35, 64)   12288       max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 35, 35, 64)   76800       activation_6[0][0]               
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 35, 35, 96)   82944       activation_9[0][0]               
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 35, 35, 32)   6144        average_pooling2d[0][0]          
__________________________________________________________________________________________________
batch_normalization_v1_5 (Batch (None, 35, 35, 64)   192         conv2d_5[0][0]                   
__________________________________________________________________________________________________
batch_normalization_v1_7 (Batch (None, 35, 35, 64)   192         conv2d_7[0][0]                   
__________________________________________________________________________________________________
batch_normalization_v1_10 (Batc (None, 35, 35, 96)   288         conv2d_10[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_11 (Batc (None, 35, 35, 32)   96          conv2d_11[0][0]                  
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 35, 35, 64)   0           batch_normalization_v1_5[0][0]   
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 35, 35, 64)   0           batch_normalization_v1_7[0][0]   
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 35, 35, 96)   0           batch_normalization_v1_10[0][0]  
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 35, 35, 32)   0           batch_normalization_v1_11[0][0]  
__________________________________________________________________________________________________
mixed0 (Concatenate)            (None, 35, 35, 256)  0           activation_5[0][0]               
                                                                 activation_7[0][0]               
                                                                 activation_10[0][0]              
                                                                 activation_11[0][0]              
__________________________________________________________________________________________________
conv2d_15 (Conv2D)              (None, 35, 35, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_15 (Batc (None, 35, 35, 64)   192         conv2d_15[0][0]                  
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 35, 35, 64)   0           batch_normalization_v1_15[0][0]  
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 35, 35, 48)   12288       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_16 (Conv2D)              (None, 35, 35, 96)   55296       activation_15[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_13 (Batc (None, 35, 35, 48)   144         conv2d_13[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_16 (Batc (None, 35, 35, 96)   288         conv2d_16[0][0]                  
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 35, 35, 48)   0           batch_normalization_v1_13[0][0]  
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 35, 35, 96)   0           batch_normalization_v1_16[0][0]  
__________________________________________________________________________________________________
average_pooling2d_1 (AveragePoo (None, 35, 35, 256)  0           mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 35, 35, 64)   16384       mixed0[0][0]                     
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 35, 35, 64)   76800       activation_13[0][0]              
__________________________________________________________________________________________________
conv2d_17 (Conv2D)              (None, 35, 35, 96)   82944       activation_16[0][0]              
__________________________________________________________________________________________________
conv2d_18 (Conv2D)              (None, 35, 35, 64)   16384       average_pooling2d_1[0][0]        
__________________________________________________________________________________________________
batch_normalization_v1_12 (Batc (None, 35, 35, 64)   192         conv2d_12[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_14 (Batc (None, 35, 35, 64)   192         conv2d_14[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_17 (Batc (None, 35, 35, 96)   288         conv2d_17[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_18 (Batc (None, 35, 35, 64)   192         conv2d_18[0][0]                  
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 35, 35, 64)   0           batch_normalization_v1_12[0][0]  
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 35, 35, 64)   0           batch_normalization_v1_14[0][0]  
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 35, 35, 96)   0           batch_normalization_v1_17[0][0]  
__________________________________________________________________________________________________
activation_18 (Activation)      (None, 35, 35, 64)   0           batch_normalization_v1_18[0][0]  
__________________________________________________________________________________________________
mixed1 (Concatenate)            (None, 35, 35, 288)  0           activation_12[0][0]              
                                                                 activation_14[0][0]              
                                                                 activation_17[0][0]              
                                                                 activation_18[0][0]              
__________________________________________________________________________________________________
conv2d_22 (Conv2D)              (None, 35, 35, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_22 (Batc (None, 35, 35, 64)   192         conv2d_22[0][0]                  
__________________________________________________________________________________________________
activation_22 (Activation)      (None, 35, 35, 64)   0           batch_normalization_v1_22[0][0]  
__________________________________________________________________________________________________
conv2d_20 (Conv2D)              (None, 35, 35, 48)   13824       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_23 (Conv2D)              (None, 35, 35, 96)   55296       activation_22[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_20 (Batc (None, 35, 35, 48)   144         conv2d_20[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_23 (Batc (None, 35, 35, 96)   288         conv2d_23[0][0]                  
__________________________________________________________________________________________________
activation_20 (Activation)      (None, 35, 35, 48)   0           batch_normalization_v1_20[0][0]  
__________________________________________________________________________________________________
activation_23 (Activation)      (None, 35, 35, 96)   0           batch_normalization_v1_23[0][0]  
__________________________________________________________________________________________________
average_pooling2d_2 (AveragePoo (None, 35, 35, 288)  0           mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_19 (Conv2D)              (None, 35, 35, 64)   18432       mixed1[0][0]                     
__________________________________________________________________________________________________
conv2d_21 (Conv2D)              (None, 35, 35, 64)   76800       activation_20[0][0]              
__________________________________________________________________________________________________
conv2d_24 (Conv2D)              (None, 35, 35, 96)   82944       activation_23[0][0]              
__________________________________________________________________________________________________
conv2d_25 (Conv2D)              (None, 35, 35, 64)   18432       average_pooling2d_2[0][0]        
__________________________________________________________________________________________________
batch_normalization_v1_19 (Batc (None, 35, 35, 64)   192         conv2d_19[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_21 (Batc (None, 35, 35, 64)   192         conv2d_21[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_24 (Batc (None, 35, 35, 96)   288         conv2d_24[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_25 (Batc (None, 35, 35, 64)   192         conv2d_25[0][0]                  
__________________________________________________________________________________________________
activation_19 (Activation)      (None, 35, 35, 64)   0           batch_normalization_v1_19[0][0]  
__________________________________________________________________________________________________
activation_21 (Activation)      (None, 35, 35, 64)   0           batch_normalization_v1_21[0][0]  
__________________________________________________________________________________________________
activation_24 (Activation)      (None, 35, 35, 96)   0           batch_normalization_v1_24[0][0]  
__________________________________________________________________________________________________
activation_25 (Activation)      (None, 35, 35, 64)   0           batch_normalization_v1_25[0][0]  
__________________________________________________________________________________________________
mixed2 (Concatenate)            (None, 35, 35, 288)  0           activation_19[0][0]              
                                                                 activation_21[0][0]              
                                                                 activation_24[0][0]              
                                                                 activation_25[0][0]              
__________________________________________________________________________________________________
conv2d_27 (Conv2D)              (None, 35, 35, 64)   18432       mixed2[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_27 (Batc (None, 35, 35, 64)   192         conv2d_27[0][0]                  
__________________________________________________________________________________________________
activation_27 (Activation)      (None, 35, 35, 64)   0           batch_normalization_v1_27[0][0]  
__________________________________________________________________________________________________
conv2d_28 (Conv2D)              (None, 35, 35, 96)   55296       activation_27[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_28 (Batc (None, 35, 35, 96)   288         conv2d_28[0][0]                  
__________________________________________________________________________________________________
activation_28 (Activation)      (None, 35, 35, 96)   0           batch_normalization_v1_28[0][0]  
__________________________________________________________________________________________________
conv2d_26 (Conv2D)              (None, 17, 17, 384)  995328      mixed2[0][0]                     
__________________________________________________________________________________________________
conv2d_29 (Conv2D)              (None, 17, 17, 96)   82944       activation_28[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_26 (Batc (None, 17, 17, 384)  1152        conv2d_26[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_29 (Batc (None, 17, 17, 96)   288         conv2d_29[0][0]                  
__________________________________________________________________________________________________
activation_26 (Activation)      (None, 17, 17, 384)  0           batch_normalization_v1_26[0][0]  
__________________________________________________________________________________________________
activation_29 (Activation)      (None, 17, 17, 96)   0           batch_normalization_v1_29[0][0]  
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 17, 17, 288)  0           mixed2[0][0]                     
__________________________________________________________________________________________________
mixed3 (Concatenate)            (None, 17, 17, 768)  0           activation_26[0][0]              
                                                                 activation_29[0][0]              
                                                                 max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
conv2d_34 (Conv2D)              (None, 17, 17, 128)  98304       mixed3[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_34 (Batc (None, 17, 17, 128)  384         conv2d_34[0][0]                  
__________________________________________________________________________________________________
activation_34 (Activation)      (None, 17, 17, 128)  0           batch_normalization_v1_34[0][0]  
__________________________________________________________________________________________________
conv2d_35 (Conv2D)              (None, 17, 17, 128)  114688      activation_34[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_35 (Batc (None, 17, 17, 128)  384         conv2d_35[0][0]                  
__________________________________________________________________________________________________
activation_35 (Activation)      (None, 17, 17, 128)  0           batch_normalization_v1_35[0][0]  
__________________________________________________________________________________________________
conv2d_31 (Conv2D)              (None, 17, 17, 128)  98304       mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_36 (Conv2D)              (None, 17, 17, 128)  114688      activation_35[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_31 (Batc (None, 17, 17, 128)  384         conv2d_31[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_36 (Batc (None, 17, 17, 128)  384         conv2d_36[0][0]                  
__________________________________________________________________________________________________
activation_31 (Activation)      (None, 17, 17, 128)  0           batch_normalization_v1_31[0][0]  
__________________________________________________________________________________________________
activation_36 (Activation)      (None, 17, 17, 128)  0           batch_normalization_v1_36[0][0]  
__________________________________________________________________________________________________
conv2d_32 (Conv2D)              (None, 17, 17, 128)  114688      activation_31[0][0]              
__________________________________________________________________________________________________
conv2d_37 (Conv2D)              (None, 17, 17, 128)  114688      activation_36[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_32 (Batc (None, 17, 17, 128)  384         conv2d_32[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_37 (Batc (None, 17, 17, 128)  384         conv2d_37[0][0]                  
__________________________________________________________________________________________________
activation_32 (Activation)      (None, 17, 17, 128)  0           batch_normalization_v1_32[0][0]  
__________________________________________________________________________________________________
activation_37 (Activation)      (None, 17, 17, 128)  0           batch_normalization_v1_37[0][0]  
__________________________________________________________________________________________________
average_pooling2d_3 (AveragePoo (None, 17, 17, 768)  0           mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_30 (Conv2D)              (None, 17, 17, 192)  147456      mixed3[0][0]                     
__________________________________________________________________________________________________
conv2d_33 (Conv2D)              (None, 17, 17, 192)  172032      activation_32[0][0]              
__________________________________________________________________________________________________
conv2d_38 (Conv2D)              (None, 17, 17, 192)  172032      activation_37[0][0]              
__________________________________________________________________________________________________
conv2d_39 (Conv2D)              (None, 17, 17, 192)  147456      average_pooling2d_3[0][0]        
__________________________________________________________________________________________________
batch_normalization_v1_30 (Batc (None, 17, 17, 192)  576         conv2d_30[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_33 (Batc (None, 17, 17, 192)  576         conv2d_33[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_38 (Batc (None, 17, 17, 192)  576         conv2d_38[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_39 (Batc (None, 17, 17, 192)  576         conv2d_39[0][0]                  
__________________________________________________________________________________________________
activation_30 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_30[0][0]  
__________________________________________________________________________________________________
activation_33 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_33[0][0]  
__________________________________________________________________________________________________
activation_38 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_38[0][0]  
__________________________________________________________________________________________________
activation_39 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_39[0][0]  
__________________________________________________________________________________________________
mixed4 (Concatenate)            (None, 17, 17, 768)  0           activation_30[0][0]              
                                                                 activation_33[0][0]              
                                                                 activation_38[0][0]              
                                                                 activation_39[0][0]              
__________________________________________________________________________________________________
conv2d_44 (Conv2D)              (None, 17, 17, 160)  122880      mixed4[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_44 (Batc (None, 17, 17, 160)  480         conv2d_44[0][0]                  
__________________________________________________________________________________________________
activation_44 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_44[0][0]  
__________________________________________________________________________________________________
conv2d_45 (Conv2D)              (None, 17, 17, 160)  179200      activation_44[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_45 (Batc (None, 17, 17, 160)  480         conv2d_45[0][0]                  
__________________________________________________________________________________________________
activation_45 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_45[0][0]  
__________________________________________________________________________________________________
conv2d_41 (Conv2D)              (None, 17, 17, 160)  122880      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_46 (Conv2D)              (None, 17, 17, 160)  179200      activation_45[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_41 (Batc (None, 17, 17, 160)  480         conv2d_41[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_46 (Batc (None, 17, 17, 160)  480         conv2d_46[0][0]                  
__________________________________________________________________________________________________
activation_41 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_41[0][0]  
__________________________________________________________________________________________________
activation_46 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_46[0][0]  
__________________________________________________________________________________________________
conv2d_42 (Conv2D)              (None, 17, 17, 160)  179200      activation_41[0][0]              
__________________________________________________________________________________________________
conv2d_47 (Conv2D)              (None, 17, 17, 160)  179200      activation_46[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_42 (Batc (None, 17, 17, 160)  480         conv2d_42[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_47 (Batc (None, 17, 17, 160)  480         conv2d_47[0][0]                  
__________________________________________________________________________________________________
activation_42 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_42[0][0]  
__________________________________________________________________________________________________
activation_47 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_47[0][0]  
__________________________________________________________________________________________________
average_pooling2d_4 (AveragePoo (None, 17, 17, 768)  0           mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_40 (Conv2D)              (None, 17, 17, 192)  147456      mixed4[0][0]                     
__________________________________________________________________________________________________
conv2d_43 (Conv2D)              (None, 17, 17, 192)  215040      activation_42[0][0]              
__________________________________________________________________________________________________
conv2d_48 (Conv2D)              (None, 17, 17, 192)  215040      activation_47[0][0]              
__________________________________________________________________________________________________
conv2d_49 (Conv2D)              (None, 17, 17, 192)  147456      average_pooling2d_4[0][0]        
__________________________________________________________________________________________________
batch_normalization_v1_40 (Batc (None, 17, 17, 192)  576         conv2d_40[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_43 (Batc (None, 17, 17, 192)  576         conv2d_43[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_48 (Batc (None, 17, 17, 192)  576         conv2d_48[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_49 (Batc (None, 17, 17, 192)  576         conv2d_49[0][0]                  
__________________________________________________________________________________________________
activation_40 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_40[0][0]  
__________________________________________________________________________________________________
activation_43 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_43[0][0]  
__________________________________________________________________________________________________
activation_48 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_48[0][0]  
__________________________________________________________________________________________________
activation_49 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_49[0][0]  
__________________________________________________________________________________________________
mixed5 (Concatenate)            (None, 17, 17, 768)  0           activation_40[0][0]              
                                                                 activation_43[0][0]              
                                                                 activation_48[0][0]              
                                                                 activation_49[0][0]              
__________________________________________________________________________________________________
conv2d_54 (Conv2D)              (None, 17, 17, 160)  122880      mixed5[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_54 (Batc (None, 17, 17, 160)  480         conv2d_54[0][0]                  
__________________________________________________________________________________________________
activation_54 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_54[0][0]  
__________________________________________________________________________________________________
conv2d_55 (Conv2D)              (None, 17, 17, 160)  179200      activation_54[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_55 (Batc (None, 17, 17, 160)  480         conv2d_55[0][0]                  
__________________________________________________________________________________________________
activation_55 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_55[0][0]  
__________________________________________________________________________________________________
conv2d_51 (Conv2D)              (None, 17, 17, 160)  122880      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_56 (Conv2D)              (None, 17, 17, 160)  179200      activation_55[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_51 (Batc (None, 17, 17, 160)  480         conv2d_51[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_56 (Batc (None, 17, 17, 160)  480         conv2d_56[0][0]                  
__________________________________________________________________________________________________
activation_51 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_51[0][0]  
__________________________________________________________________________________________________
activation_56 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_56[0][0]  
__________________________________________________________________________________________________
conv2d_52 (Conv2D)              (None, 17, 17, 160)  179200      activation_51[0][0]              
__________________________________________________________________________________________________
conv2d_57 (Conv2D)              (None, 17, 17, 160)  179200      activation_56[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_52 (Batc (None, 17, 17, 160)  480         conv2d_52[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_57 (Batc (None, 17, 17, 160)  480         conv2d_57[0][0]                  
__________________________________________________________________________________________________
activation_52 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_52[0][0]  
__________________________________________________________________________________________________
activation_57 (Activation)      (None, 17, 17, 160)  0           batch_normalization_v1_57[0][0]  
__________________________________________________________________________________________________
average_pooling2d_5 (AveragePoo (None, 17, 17, 768)  0           mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_50 (Conv2D)              (None, 17, 17, 192)  147456      mixed5[0][0]                     
__________________________________________________________________________________________________
conv2d_53 (Conv2D)              (None, 17, 17, 192)  215040      activation_52[0][0]              
__________________________________________________________________________________________________
conv2d_58 (Conv2D)              (None, 17, 17, 192)  215040      activation_57[0][0]              
__________________________________________________________________________________________________
conv2d_59 (Conv2D)              (None, 17, 17, 192)  147456      average_pooling2d_5[0][0]        
__________________________________________________________________________________________________
batch_normalization_v1_50 (Batc (None, 17, 17, 192)  576         conv2d_50[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_53 (Batc (None, 17, 17, 192)  576         conv2d_53[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_58 (Batc (None, 17, 17, 192)  576         conv2d_58[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_59 (Batc (None, 17, 17, 192)  576         conv2d_59[0][0]                  
__________________________________________________________________________________________________
activation_50 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_50[0][0]  
__________________________________________________________________________________________________
activation_53 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_53[0][0]  
__________________________________________________________________________________________________
activation_58 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_58[0][0]  
__________________________________________________________________________________________________
activation_59 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_59[0][0]  
__________________________________________________________________________________________________
mixed6 (Concatenate)            (None, 17, 17, 768)  0           activation_50[0][0]              
                                                                 activation_53[0][0]              
                                                                 activation_58[0][0]              
                                                                 activation_59[0][0]              
__________________________________________________________________________________________________
conv2d_64 (Conv2D)              (None, 17, 17, 192)  147456      mixed6[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_64 (Batc (None, 17, 17, 192)  576         conv2d_64[0][0]                  
__________________________________________________________________________________________________
activation_64 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_64[0][0]  
__________________________________________________________________________________________________
conv2d_65 (Conv2D)              (None, 17, 17, 192)  258048      activation_64[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_65 (Batc (None, 17, 17, 192)  576         conv2d_65[0][0]                  
__________________________________________________________________________________________________
activation_65 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_65[0][0]  
__________________________________________________________________________________________________
conv2d_61 (Conv2D)              (None, 17, 17, 192)  147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_66 (Conv2D)              (None, 17, 17, 192)  258048      activation_65[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_61 (Batc (None, 17, 17, 192)  576         conv2d_61[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_66 (Batc (None, 17, 17, 192)  576         conv2d_66[0][0]                  
__________________________________________________________________________________________________
activation_61 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_61[0][0]  
__________________________________________________________________________________________________
activation_66 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_66[0][0]  
__________________________________________________________________________________________________
conv2d_62 (Conv2D)              (None, 17, 17, 192)  258048      activation_61[0][0]              
__________________________________________________________________________________________________
conv2d_67 (Conv2D)              (None, 17, 17, 192)  258048      activation_66[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_62 (Batc (None, 17, 17, 192)  576         conv2d_62[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_67 (Batc (None, 17, 17, 192)  576         conv2d_67[0][0]                  
__________________________________________________________________________________________________
activation_62 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_62[0][0]  
__________________________________________________________________________________________________
activation_67 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_67[0][0]  
__________________________________________________________________________________________________
average_pooling2d_6 (AveragePoo (None, 17, 17, 768)  0           mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_60 (Conv2D)              (None, 17, 17, 192)  147456      mixed6[0][0]                     
__________________________________________________________________________________________________
conv2d_63 (Conv2D)              (None, 17, 17, 192)  258048      activation_62[0][0]              
__________________________________________________________________________________________________
conv2d_68 (Conv2D)              (None, 17, 17, 192)  258048      activation_67[0][0]              
__________________________________________________________________________________________________
conv2d_69 (Conv2D)              (None, 17, 17, 192)  147456      average_pooling2d_6[0][0]        
__________________________________________________________________________________________________
batch_normalization_v1_60 (Batc (None, 17, 17, 192)  576         conv2d_60[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_63 (Batc (None, 17, 17, 192)  576         conv2d_63[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_68 (Batc (None, 17, 17, 192)  576         conv2d_68[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_69 (Batc (None, 17, 17, 192)  576         conv2d_69[0][0]                  
__________________________________________________________________________________________________
activation_60 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_60[0][0]  
__________________________________________________________________________________________________
activation_63 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_63[0][0]  
__________________________________________________________________________________________________
activation_68 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_68[0][0]  
__________________________________________________________________________________________________
activation_69 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_69[0][0]  
__________________________________________________________________________________________________
mixed7 (Concatenate)            (None, 17, 17, 768)  0           activation_60[0][0]              
                                                                 activation_63[0][0]              
                                                                 activation_68[0][0]              
                                                                 activation_69[0][0]              
__________________________________________________________________________________________________
conv2d_72 (Conv2D)              (None, 17, 17, 192)  147456      mixed7[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_72 (Batc (None, 17, 17, 192)  576         conv2d_72[0][0]                  
__________________________________________________________________________________________________
activation_72 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_72[0][0]  
__________________________________________________________________________________________________
conv2d_73 (Conv2D)              (None, 17, 17, 192)  258048      activation_72[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_73 (Batc (None, 17, 17, 192)  576         conv2d_73[0][0]                  
__________________________________________________________________________________________________
activation_73 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_73[0][0]  
__________________________________________________________________________________________________
conv2d_70 (Conv2D)              (None, 17, 17, 192)  147456      mixed7[0][0]                     
__________________________________________________________________________________________________
conv2d_74 (Conv2D)              (None, 17, 17, 192)  258048      activation_73[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_70 (Batc (None, 17, 17, 192)  576         conv2d_70[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_74 (Batc (None, 17, 17, 192)  576         conv2d_74[0][0]                  
__________________________________________________________________________________________________
activation_70 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_70[0][0]  
__________________________________________________________________________________________________
activation_74 (Activation)      (None, 17, 17, 192)  0           batch_normalization_v1_74[0][0]  
__________________________________________________________________________________________________
conv2d_71 (Conv2D)              (None, 8, 8, 320)    552960      activation_70[0][0]              
__________________________________________________________________________________________________
conv2d_75 (Conv2D)              (None, 8, 8, 192)    331776      activation_74[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_71 (Batc (None, 8, 8, 320)    960         conv2d_71[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_75 (Batc (None, 8, 8, 192)    576         conv2d_75[0][0]                  
__________________________________________________________________________________________________
activation_71 (Activation)      (None, 8, 8, 320)    0           batch_normalization_v1_71[0][0]  
__________________________________________________________________________________________________
activation_75 (Activation)      (None, 8, 8, 192)    0           batch_normalization_v1_75[0][0]  
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, 8, 8, 768)    0           mixed7[0][0]                     
__________________________________________________________________________________________________
mixed8 (Concatenate)            (None, 8, 8, 1280)   0           activation_71[0][0]              
                                                                 activation_75[0][0]              
                                                                 max_pooling2d_3[0][0]            
__________________________________________________________________________________________________
conv2d_80 (Conv2D)              (None, 8, 8, 448)    573440      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_80 (Batc (None, 8, 8, 448)    1344        conv2d_80[0][0]                  
__________________________________________________________________________________________________
activation_80 (Activation)      (None, 8, 8, 448)    0           batch_normalization_v1_80[0][0]  
__________________________________________________________________________________________________
conv2d_77 (Conv2D)              (None, 8, 8, 384)    491520      mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_81 (Conv2D)              (None, 8, 8, 384)    1548288     activation_80[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_77 (Batc (None, 8, 8, 384)    1152        conv2d_77[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_81 (Batc (None, 8, 8, 384)    1152        conv2d_81[0][0]                  
__________________________________________________________________________________________________
activation_77 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_77[0][0]  
__________________________________________________________________________________________________
activation_81 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_81[0][0]  
__________________________________________________________________________________________________
conv2d_78 (Conv2D)              (None, 8, 8, 384)    442368      activation_77[0][0]              
__________________________________________________________________________________________________
conv2d_79 (Conv2D)              (None, 8, 8, 384)    442368      activation_77[0][0]              
__________________________________________________________________________________________________
conv2d_82 (Conv2D)              (None, 8, 8, 384)    442368      activation_81[0][0]              
__________________________________________________________________________________________________
conv2d_83 (Conv2D)              (None, 8, 8, 384)    442368      activation_81[0][0]              
__________________________________________________________________________________________________
average_pooling2d_7 (AveragePoo (None, 8, 8, 1280)   0           mixed8[0][0]                     
__________________________________________________________________________________________________
conv2d_76 (Conv2D)              (None, 8, 8, 320)    409600      mixed8[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_78 (Batc (None, 8, 8, 384)    1152        conv2d_78[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_79 (Batc (None, 8, 8, 384)    1152        conv2d_79[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_82 (Batc (None, 8, 8, 384)    1152        conv2d_82[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_83 (Batc (None, 8, 8, 384)    1152        conv2d_83[0][0]                  
__________________________________________________________________________________________________
conv2d_84 (Conv2D)              (None, 8, 8, 192)    245760      average_pooling2d_7[0][0]        
__________________________________________________________________________________________________
batch_normalization_v1_76 (Batc (None, 8, 8, 320)    960         conv2d_76[0][0]                  
__________________________________________________________________________________________________
activation_78 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_78[0][0]  
__________________________________________________________________________________________________
activation_79 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_79[0][0]  
__________________________________________________________________________________________________
activation_82 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_82[0][0]  
__________________________________________________________________________________________________
activation_83 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_83[0][0]  
__________________________________________________________________________________________________
batch_normalization_v1_84 (Batc (None, 8, 8, 192)    576         conv2d_84[0][0]                  
__________________________________________________________________________________________________
activation_76 (Activation)      (None, 8, 8, 320)    0           batch_normalization_v1_76[0][0]  
__________________________________________________________________________________________________
mixed9_0 (Concatenate)          (None, 8, 8, 768)    0           activation_78[0][0]              
                                                                 activation_79[0][0]              
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 8, 8, 768)    0           activation_82[0][0]              
                                                                 activation_83[0][0]              
__________________________________________________________________________________________________
activation_84 (Activation)      (None, 8, 8, 192)    0           batch_normalization_v1_84[0][0]  
__________________________________________________________________________________________________
mixed9 (Concatenate)            (None, 8, 8, 2048)   0           activation_76[0][0]              
                                                                 mixed9_0[0][0]                   
                                                                 concatenate[0][0]                
                                                                 activation_84[0][0]              
__________________________________________________________________________________________________
conv2d_89 (Conv2D)              (None, 8, 8, 448)    917504      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_89 (Batc (None, 8, 8, 448)    1344        conv2d_89[0][0]                  
__________________________________________________________________________________________________
activation_89 (Activation)      (None, 8, 8, 448)    0           batch_normalization_v1_89[0][0]  
__________________________________________________________________________________________________
conv2d_86 (Conv2D)              (None, 8, 8, 384)    786432      mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_90 (Conv2D)              (None, 8, 8, 384)    1548288     activation_89[0][0]              
__________________________________________________________________________________________________
batch_normalization_v1_86 (Batc (None, 8, 8, 384)    1152        conv2d_86[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_90 (Batc (None, 8, 8, 384)    1152        conv2d_90[0][0]                  
__________________________________________________________________________________________________
activation_86 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_86[0][0]  
__________________________________________________________________________________________________
activation_90 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_90[0][0]  
__________________________________________________________________________________________________
conv2d_87 (Conv2D)              (None, 8, 8, 384)    442368      activation_86[0][0]              
__________________________________________________________________________________________________
conv2d_88 (Conv2D)              (None, 8, 8, 384)    442368      activation_86[0][0]              
__________________________________________________________________________________________________
conv2d_91 (Conv2D)              (None, 8, 8, 384)    442368      activation_90[0][0]              
__________________________________________________________________________________________________
conv2d_92 (Conv2D)              (None, 8, 8, 384)    442368      activation_90[0][0]              
__________________________________________________________________________________________________
average_pooling2d_8 (AveragePoo (None, 8, 8, 2048)   0           mixed9[0][0]                     
__________________________________________________________________________________________________
conv2d_85 (Conv2D)              (None, 8, 8, 320)    655360      mixed9[0][0]                     
__________________________________________________________________________________________________
batch_normalization_v1_87 (Batc (None, 8, 8, 384)    1152        conv2d_87[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_88 (Batc (None, 8, 8, 384)    1152        conv2d_88[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_91 (Batc (None, 8, 8, 384)    1152        conv2d_91[0][0]                  
__________________________________________________________________________________________________
batch_normalization_v1_92 (Batc (None, 8, 8, 384)    1152        conv2d_92[0][0]                  
__________________________________________________________________________________________________
conv2d_93 (Conv2D)              (None, 8, 8, 192)    393216      average_pooling2d_8[0][0]        
__________________________________________________________________________________________________
batch_normalization_v1_85 (Batc (None, 8, 8, 320)    960         conv2d_85[0][0]                  
__________________________________________________________________________________________________
activation_87 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_87[0][0]  
__________________________________________________________________________________________________
activation_88 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_88[0][0]  
__________________________________________________________________________________________________
activation_91 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_91[0][0]  
__________________________________________________________________________________________________
activation_92 (Activation)      (None, 8, 8, 384)    0           batch_normalization_v1_92[0][0]  
__________________________________________________________________________________________________
batch_normalization_v1_93 (Batc (None, 8, 8, 192)    576         conv2d_93[0][0]                  
__________________________________________________________________________________________________
activation_85 (Activation)      (None, 8, 8, 320)    0           batch_normalization_v1_85[0][0]  
__________________________________________________________________________________________________
mixed9_1 (Concatenate)          (None, 8, 8, 768)    0           activation_87[0][0]              
                                                                 activation_88[0][0]              
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 8, 8, 768)    0           activation_91[0][0]              
                                                                 activation_92[0][0]              
__________________________________________________________________________________________________
activation_93 (Activation)      (None, 8, 8, 192)    0           batch_normalization_v1_93[0][0]  
__________________________________________________________________________________________________
mixed10 (Concatenate)           (None, 8, 8, 2048)   0           activation_85[0][0]              
                                                                 mixed9_1[0][0]                   
                                                                 concatenate_1[0][0]              
                                                                 activation_93[0][0]              
__________________________________________________________________________________________________
avg_pool (GlobalAveragePooling2 (None, 2048)         0           mixed10[0][0]                    
__________________________________________________________________________________________________
predictions (Dense)             (None, 1000)         2049000     avg_pool[0][0]                   
==================================================================================================
Total params: 23,851,784
Trainable params: 0
Non-trainable params: 23,851,784
__________________________________________________________________________________________________
In [8]:
y_hat = base_model.predict(X)
In [9]:
imagenet_labels = decode_predictions(y_hat, top=3)
In [10]:
print('Predicted:', imagenet_labels)
Predicted: [[('n04273569', 'speedboat', 0.23419371), ('n03673027', 'liner', 0.21059692), ('n02981792', 'catamaran', 0.16518675)], [('n03673027', 'liner', 0.9226364), ('n02981792', 'catamaran', 0.013785596), ('n09428293', 'seashore', 0.0009258849)], [('n04273569', 'speedboat', 0.78583926), ('n02704792', 'amphibian', 0.07139961), ('n02981792', 'catamaran', 0.06575323)], [('n02504458', 'African_elephant', 0.5244344), ('n02504013', 'Indian_elephant', 0.24280018), ('n01871265', 'tusker', 0.1114159)], [('n04285008', 'sports_car', 0.88917637), ('n03100240', 'convertible', 0.033024576), ('n02974003', 'car_wheel', 0.017411336)], [('n03770679', 'minivan', 0.5996617), ('n04285008', 'sports_car', 0.02500174), ('n02814533', 'beach_wagon', 0.020977488)]]

And here is the result. At first sight, it's surprising that the first image reaches only 23% confidence compared to the two other boat images. We can also see that the Twingo is classified as a minivan, which is largely due to the resizing step distorting its proportions too much. And finally, the cat is classified as an African elephant with more than 50% confidence.

Let's now look at the top-5 predictions instead of only the top-1.

In [11]:
idx = np.argmax(y_hat, axis=1)   # imagenet label is at the end of the notebook
percent = 100*np.max(y_hat, axis=1)

fig, axes = plt.subplots(2, 3, figsize=(20,12))
axes[0, 0].imshow(test_image1)
axes[0, 0].set_title("{} ({:.1f}%)".format(imagenet_labels[0][0][1], 100*imagenet_labels[0][0][2]))
axes[0, 1].imshow(test_image2)
axes[0, 1].set_title("{} ({:.1f}%)".format(imagenet_labels[1][0][1], 100*imagenet_labels[1][0][2]))
axes[0, 2].imshow(test_image3)
axes[0, 2].set_title("{} ({:.1f}%)".format(imagenet_labels[2][0][1], 100*imagenet_labels[2][0][2]))
axes[1, 0].imshow(test_image4)
axes[1, 0].set_title("{} ({:.1f}%)".format(imagenet_labels[3][0][1], 100*imagenet_labels[3][0][2]))
axes[1, 1].imshow(test_image5)
axes[1, 1].set_title("{} ({:.1f}%)".format(imagenet_labels[4][0][1], 100*imagenet_labels[4][0][2]))
axes[1, 2].imshow(test_image6)
axes[1, 2].set_title("{} ({:.1f}%)".format(imagenet_labels[5][0][1], 100*imagenet_labels[5][0][2]))
plt.show()
In [12]:
res = decode_predictions(y_hat, top=5)
for i, row in enumerate(res):
    print("Image {}".format(i))
    for j, (_, label, percent) in enumerate(row):
        print("\tRank {} - {} ({:.1f}%)".format(j+1, label, 100*percent))
Image 0
	Rank 1 - speedboat (23.4%)
	Rank 2 - liner (21.1%)
	Rank 3 - catamaran (16.5%)
	Rank 4 - trimaran (2.6%)
	Rank 5 - dock (1.7%)
Image 1
	Rank 1 - liner (92.3%)
	Rank 2 - catamaran (1.4%)
	Rank 3 - seashore (0.1%)
	Rank 4 - dock (0.1%)
	Rank 5 - howler_monkey (0.1%)
Image 2
	Rank 1 - speedboat (78.6%)
	Rank 2 - amphibian (7.1%)
	Rank 3 - catamaran (6.6%)
	Rank 4 - trimaran (0.6%)
	Rank 5 - fireboat (0.4%)
Image 3
	Rank 1 - African_elephant (52.4%)
	Rank 2 - Indian_elephant (24.3%)
	Rank 3 - tusker (11.1%)
	Rank 4 - buckle (0.6%)
	Rank 5 - nail (0.6%)
Image 4
	Rank 1 - sports_car (88.9%)
	Rank 2 - convertible (3.3%)
	Rank 3 - car_wheel (1.7%)
	Rank 4 - grille (1.4%)
	Rank 5 - racer (0.3%)
Image 5
	Rank 1 - minivan (60.0%)
	Rank 2 - sports_car (2.5%)
	Rank 3 - beach_wagon (2.1%)
	Rank 4 - car_wheel (1.1%)
	Rank 5 - grille (0.8%)

For the first boat, the model hesitates between speedboat and liner, a distinction that can be tricky for humans too. For the cat, however, 88% of the cumulative confidence goes to elephant classes... the remaining 12% is spread over the 997 other classes. Let's now look at how confident the model is that the image is a cat.

In [13]:
imagenet_sublabels = {
    281: 'tabby, tabby cat',
    282: 'tiger cat',
    283: 'Persian cat',
    284: 'Siamese cat, Siamese',
    285: 'Egyptian cat',
    286: 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor',
    287: 'lynx, catamount',
    288: 'leopard, Panthera pardus',
    289: 'snow leopard, ounce, Panthera uncia',
    290: 'jaguar, panther, Panthera onca, Felis onca',
    291: 'lion, king of beasts, Panthera leo',
    292: 'tiger, Panthera tigris',
}
In [14]:
res2 = np.argsort(y_hat[3])[::-1]
ans = []
for i in range(281, 286):
    percent = 100*y_hat[3, i]
    label = imagenet_sublabels[i]
    ranking = np.argwhere(res2 == i)[0][0]
    ans.append((ranking, label, percent))

ans.sort(key=lambda x:x[0])

for ranking, label, percent in ans:
    print("\tRank {} - {} ({:.5f}%)".format(ranking, label, percent))
	Rank 48 - Persian cat (0.03445%)
	Rank 343 - Siamese cat, Siamese (0.00703%)
	Rank 681 - tabby, tabby cat (0.00332%)
	Rank 771 - tiger cat (0.00266%)
	Rank 981 - Egyptian cat (0.00082%)

The first cat class proposed by the model appears only at rank 48... with a confidence of 0.03%! This confirms that a CNN is really good at detecting patterns (textures) rather than shapes.

Feature Extraction

Let's quickly look at the feature vector of each image.

In [15]:
out = base_model.get_layer("avg_pool").output
inp = base_model.input

get_output = tf.keras.backend.function([inp], [out])
In [16]:
features = get_output(X)[0]
In [17]:
features.shape
Out[17]:
(6, 2048)

The model outputs a 2048-dimensional feature vector per image. To ease visualization, each vector will be reshaped into a 64x32 pixel image and plotted with a shared color scale.

In [18]:
vmin, vmax = features.min(), features.max()
In [19]:
fig, axes = plt.subplots(3, 2, figsize=(20,12))
axes[0, 0].imshow(features[0].reshape(32, 64), vmin=vmin, vmax=vmax)
axes[0, 1].imshow(features[1].reshape(32, 64), vmin=vmin, vmax=vmax)
axes[1, 0].imshow(features[2].reshape(32, 64), vmin=vmin, vmax=vmax)
axes[1, 1].imshow(features[3].reshape(32, 64), vmin=vmin, vmax=vmax)
axes[2, 0].imshow(features[4].reshape(32, 64), vmin=vmin, vmax=vmax)
im = axes[2, 1].imshow(features[5].reshape(32, 64), vmin=vmin, vmax=vmax)
fig.colorbar(im, ax=axes.flat)  # https://stackoverflow.com/questions/13784201/matplotlib-2-subplots-1-colorbar

axes[0, 0].set_title("Feature activation of Image 1")
axes[0, 1].set_title("Feature activation of Image 2")
axes[1, 0].set_title("Feature activation of Image 3")
axes[1, 1].set_title("Feature activation of Image 4")
axes[2, 0].set_title("Feature activation of Image 5")
axes[2, 1].set_title("Feature activation of Image 6")
plt.show()

It's difficult to spot differences here (except for image 3, which looks brighter). However, we can compute the Jensen-Shannon divergence to get a distance between those vectors.

In [20]:
def JSD(P, Q):
    _P = P / norm(P, ord=1)
    _Q = Q / norm(Q, ord=1)
    _M = 0.5 * (_P + _Q)
    return 0.5 * (entropy(_P, _M) + entropy(_Q, _M))

rel = [(0, 1), (0, 2), (1, 2), (0, 3), (4, 5)]
for image_a, image_b in rel:
    jsd = JSD(features[image_a].flatten(), features[image_b].flatten())
    print("Entropy between image {} and {} is {:.5f}".format(image_a, image_b, jsd))
Entropy between image 0 and 1 is 0.09621
Entropy between image 0 and 2 is 0.06595
Entropy between image 1 and 2 is 0.14012
Entropy between image 0 and 3 is 0.27313
Entropy between image 4 and 5 is 0.15157

As expected, the distance between boats 1 and 2 is very small (at least smaller than their distance to boat 3). The cat is, logically, far from the boats, and the two cars are quite similar but not identical. Now let's look at a heatmap of the areas that support the prediction.

Heatmap

Principle

In classification, the "extractor" part of the CNN (here Inception V3) produces an 8 x 8 x 2048 tensor. This tensor is reduced to 1 x 1 x 2048 by a Global Average Pooling layer (or a Max Pooling layer in some cases). This vector is then fed to a fully connected layer for the classification. A single fully connected layer is equivalent to one linear regression per class (ignoring the activation). In transfer learning, we basically just retrain those linear models. This can be represented as follows:

The trick to obtain the heatmap is to reduce the 2048 features to 1 using the weights of the linear model (which means this cannot be done with more complex classifiers, e.g. multiple stacked FC layers). By computing this "regression" at each of the 8 x 8 spatial positions, we get an idea of which areas contribute most to the classification decision. This can then be represented as follows:

Mathematically, this can be seen as a swap of sums:

  • Classification

$$ PredictionValue_{class} = \sum_{i=1}^{2048} {\omega_{class, i}} * (\bar{M_{i}}) $$

with

$$ \forall i \in [1, 2048] \Rightarrow \bar{M_i} = \frac{1}{64} \times \sum_{j=1}^{8} (\sum_{k=1}^{8} {Feature_{j, k, i}}) $$

Instead, for the CAM we compute:

$$ \forall j, k \in [1, 8] \Rightarrow CAM_{class}[j, k] = \sum_{i=1}^{2048} \omega_{class, i} * Feature_{j, k, i } $$

and we could go back to PredictionValue of this class by doing

$$ PredictionValue_{class} = \frac{1}{64} \times \sum_{j=1}^{8} (\sum_{k=1}^{8} {CAM_{class}[j, k]}) $$

In this model, the feature map is 8 x 8 for a 299 x 299 input image, which means one feature "pixel" summarizes roughly a 37 x 37 patch of input pixels. Usually, to ease visualization, an interpolation is applied to the CAM matrix.
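The swap of sums above can be checked numerically on random tensors: averaging the feature map first and then applying the class weights gives exactly the same score as building the CAM first and then averaging it.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((8, 8, 2048))  # one image's feature tensor
w = rng.random(2048)                 # linear weights of one class

# Classification path: global average pooling, then the linear model
score_gap = features.mean(axis=(0, 1)) @ w

# CAM path: linear model at every spatial position, then average
cam = (features * w.reshape(1, 1, -1)).sum(axis=2)  # shape (8, 8)
score_cam = cam.mean()

print(np.isclose(score_gap, score_cam))  # True
```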

After the theory, let's move on to practice.

Application

First let's extract the Feature tensor

In [21]:
out = base_model.get_layer("mixed10").output
inp = base_model.input

get_output = tf.keras.backend.function([inp], [out])
In [22]:
conv_outputs = get_output(X)[0]
conv_outputs.shape
Out[22]:
(6, 8, 8, 2048)

As our model is already trained, we can directly retrieve the weights. In the case of transfer learning, the fit would have to be done upfront.

In [23]:
class_weights = base_model.layers[-1].get_weights()[0]
class_weights.shape
Out[23]:
(2048, 1000)

Let's now compute the heatmap for 3 images and their top-3 classes.

In [24]:
idx = np.argsort(y_hat, axis=1)[:, -3:]
idx = idx[:, ::-1]
In [25]:
def get_heatmap(image_idx, label_idx):
#     cam = np.zeros(dtype = np.float32, shape = (8, 8))
#     for i, w in enumerate(class_weights[:, label_idx]):
#         cam += w * conv_outputs[image_idx, :, :, i]
    cam = (conv_outputs[image_idx] * class_weights[:, label_idx].reshape(1, 1, -1)).sum(axis=2)
    cam /= np.max(cam)
    cam = cv2.resize(cam, (299, 299))
    heatmap = cv2.applyColorMap(np.uint8(255*(1-cam)), cv2.COLORMAP_JET)
    heatmap[np.where(cam < 0.2)] = 0
    return heatmap
In [26]:
fig, axes = plt.subplots(3, 3, figsize=(20,18))
for row, image_idx in enumerate([0, 1, 3]):
    for col, label_idx in enumerate(idx[image_idx]):
        heatmap = get_heatmap(image_idx, label_idx)
        img = 0.5 * heatmap + 0.5 * 255.0 * X[image_idx]
        pred_class = res[image_idx][col][1]
        axes[row, col].imshow(img/255.0)
        axes[row, col].set_title("CAM for label {}".format(pred_class))
plt.show()

And for the cat image, let's look at the activation maps of the cat classes.

In [27]:
image_idx = 3
fig, axes = plt.subplots(1, 5, figsize=(20,20))
for col, label_idx in enumerate(range(281, 286)):
    heatmap = get_heatmap(image_idx, label_idx)
    img = 0.5 * heatmap + 0.5 * 255.0 * X[image_idx]
    pred_class = imagenet_sublabels[label_idx]
    axes[col].imshow(img/255.0)
    axes[col].set_title("CAM for label {}".format(pred_class))
plt.show()

The last image looks strange, but this is due to the cubic interpolation, because pixels of the same color are not grouped together.

Conclusion

In this Notebook, I wanted to apply CAM to a pre-trained model and took the opportunity to play with the texture-vs-shape discovery. This kind of visualization is very useful to check whether a trained model overfits, simply by checking whether it really uses the right area to make the prediction. However, the drawback of this method is that it cannot be used with classifiers more complex than a linear one, because we need the importance of each feature in the prediction. A multi-layer classifier cannot be used, nor can most sklearn models. I only tried it by:

  • keeping the features
  • reducing dimensions with PCA
  • training a linear model
  • inverting the PCA with the weights of the linear model (to get weights in the initial feature space)
  • using those weights to build the activation map

This worked well too (and even better when you have a smaller dataset as you avoid overfitting).
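As a minimal sketch of that PCA workflow (on synthetic data, with PCA done via NumPy's SVD rather than the exact code used, which is not reproduced here), inverting the PCA boils down to multiplying the reduced-space weights by the transposed components:

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.random((100, 2048))  # synthetic pooled feature vectors (n_samples x 2048)
y = rng.random(100)          # synthetic target for one class

# PCA via SVD on centered data, keeping k components
k = 16
mean = F.mean(axis=0)
U, S, Vt = np.linalg.svd(F - mean, full_matrices=False)
components = Vt[:k]                 # (k, 2048)
F_red = (F - mean) @ components.T   # reduced features (n_samples, k)

# Linear model trained in the reduced space (least squares)
w_red, *_ = np.linalg.lstsq(F_red, y, rcond=None)

# Invert the PCA: weights expressed back in the original 2048-d feature space
w_full = components.T @ w_red       # (2048,)

# Sanity check: both weight vectors give identical predictions
assert np.allclose(F_red @ w_red, (F - mean) @ w_full)
```

Those `w_full` weights play the same role as one column of `class_weights` in `get_heatmap`; the centering only shifts the CAM by a constant, so the relative activation map is unchanged.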